rand 0
Are Language Models Agnostic to Linguistically Grounded Perturbations? A Case Study of Indic Languages
Ghosh, Poulami, Dabre, Raj, Bhattacharyya, Pushpak
Pre-trained language models (PLMs) are known to be susceptible to perturbations to the input text, but existing works do not explicitly focus on linguistically grounded attacks, which are subtle and more prevalent in nature. In this paper, we study whether PLMs are agnostic to linguistically grounded attacks or not. To this end, we offer the first study addressing this, investigating different Indic languages and various downstream tasks. Our findings reveal that although PLMs are susceptible to linguistic perturbations, when compared to non-linguistic attacks, PLMs exhibit a slightly lower susceptibility to linguistic attacks. This highlights that even constrained attacks are effective. Moreover, we investigate the implications of these outcomes across a range of languages, encompassing diverse language families and different scripts.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > India (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (4 more...)
- Information Technology > Security & Privacy (0.69)
- Government > Military (0.47)
Cost-Efficient Subjective Task Annotation and Modeling through Few-Shot Annotator Adaptation
Golazizian, Preni, Omrani, Ali, Ziabari, Alireza S., Dehghani, Morteza
In subjective NLP tasks, where a single ground truth does not exist, the inclusion of diverse annotators becomes crucial as their unique perspectives significantly influence the annotations. In realistic scenarios, the annotation budget often becomes the main determinant of the number of perspectives (i.e., annotators) included in the data and subsequent modeling. We introduce a novel framework for annotation collection and modeling in subjective tasks that aims to minimize the annotation budget while maximizing the predictive performance for each annotator. Our framework has a two-stage design: first, we rely on a small set of annotators to build a multitask model, and second, we augment the model for a new perspective by strategically annotating a few samples per annotator. To test our framework at scale, we introduce and release a unique dataset, Moral Foundations Subjective Corpus, of 2000 Reddit posts annotated by 24 annotators for moral sentiment. We demonstrate that our framework surpasses the previous SOTA in capturing the annotators' individual perspectives with as little as 25% of the original annotation budget on two datasets. Furthermore, our framework results in more equitable models, reducing the performance disparity among annotators.
- North America > United States > California (0.14)
- Europe > United Kingdom > England (0.04)
- Asia > Singapore (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
The Validity of Evaluation Results: Assessing Concurrence Across Compositionality Benchmarks
Sun, Kaiser, Williams, Adina, Hupkes, Dieuwke
NLP models have progressed drastically in recent years, according to numerous datasets proposed to evaluate performance. Questions remain, however, about how particular dataset design choices may impact the conclusions we draw about model capabilities. In this work, we investigate this question in the domain of compositional generalization. We examine the performance of six modeling approaches across 4 datasets, split according to 8 compositional splitting strategies, ranking models by 18 compositional generalization splits in total. Our results show that: i) the datasets, although all designed to evaluate compositional generalization, rank modeling approaches differently; ii) datasets generated by humans align better with each other than they with synthetic datasets, or than synthetic datasets among themselves; iii) generally, whether datasets are sampled from the same source is more predictive of the resulting model ranking than whether they maintain the same interpretation of compositionality; and iv) which lexical items are used in the data can strongly impact conclusions. Overall, our results demonstrate that much work remains to be done when it comes to assessing whether popular evaluation datasets measure what they intend to measure, and suggest that elucidating more rigorous standards for establishing the validity of evaluation sets could benefit the field.
- North America > Dominican Republic (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.70)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.46)
Estimating Basis Functions in Massive Fields under the Spatial Mixed Effects Model
Pazdernik, Karl T., Maitra, Ranjan
Spatial prediction is commonly achieved under the assumption of a Gaussian random field (GRF) by obtaining maximum likelihood estimates of parameters, and then using the kriging equations to arrive at predicted values. For massive datasets, fixed rank kriging using the Expectation-Maximization (EM) algorithm for estimation has been proposed as an alternative to the usual but computationally prohibitive kriging method. The method reduces computation cost of estimation by redefining the spatial process as a linear combination of basis functions and spatial random effects. A disadvantage of this method is that it imposes constraints on the relationship between the observed locations and the knots. We develop an alternative method that utilizes the Spatial Mixed Effects (SME) model, but allows for additional flexibility by estimating the range of the spatial dependence between the observations and the knots via an Alternating Expectation Conditional Maximization (AECM) algorithm. Experiments show that our methodology improves estimation without sacrificing prediction accuracy while also minimizing the additional computational burden of extra parameter estimation. The methodology is applied to a temperature data set archived by the United States National Climate Data Center, with improved results over previous methodology.
- Europe > Austria > Vienna (0.14)
- North America > United States > California (0.04)
- North America > United States > Washington > Benton County > Richland (0.04)
- (10 more...)
- Food & Agriculture (0.92)
- Government > Regional Government > North America Government > United States Government (0.92)
- Energy (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
Efficient Neural Network Robustness Certification with General Activation Functions
Zhang, Huan, Weng, Tsui-Wei, Chen, Pin-Yu, Hsieh, Cho-Jui, Daniel, Luca
Finding minimum distortion of adversarial examples and thus certifying robustness in neural network classifiers for given data points is known to be a challenging problem. Nevertheless, recently it has been shown to be possible to give a non-trivial certified lower bound of minimum adversarial distortion, and some recent progress has been made towards this direction by exploiting the piece-wise linear nature of ReLU activations. However, a generic robustness certification for general activation functions still remains largely unexplored. To address this issue, in this paper we introduce CROWN, a general framework to certify robustness of neural networks with general activation functions for given input data points. The novelty in our algorithm consists of bounding a given activation function with linear and quadratic functions, hence allowing it to tackle general activation functions including but not limited to four popular choices: ReLU, tanh, sigmoid and arctan. In addition, we facilitate the search for a tighter certified lower bound by adaptively selecting appropriate surrogates for each neuron activation. Experimental results show that CROWN on ReLU networks can notably improve the certified lower bounds compared to the current state-of-the-art algorithm Fast-Lin, while having comparable computational efficiency. Furthermore, CROWN also demonstrates its effectiveness and flexibility on networks with general activation functions, including tanh, sigmoid and arctan.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- Asia (0.04)